Multi-task learning (MTL) models have demonstrated impressive results in computer vision, natural language processing, and recommender systems. Even though many approaches have been proposed, how well these approaches balance different tasks on each parameter remains unclear. In this paper, we propose to measure the task dominance degree of a parameter by the total updates of each task on this parameter. Specifically, we compute the total updates by the exponentially decaying Average of the squared Updates (AU) on a parameter from the corresponding task. Based on this novel metric, we observe that many parameters in existing MTL methods, especially those in the higher shared layers, are still dominated by one or several tasks. The dominance of AU is mainly due to the dominance of accumulative gradients from one or several tasks. Motivated by this, we propose a Task-wise Adaptive learning rate approach, AdaTask in short, to separate the \emph{accumulative gradients} and hence the learning rate of each task for each parameter in adaptive learning rate approaches (e.g., AdaGrad, RMSProp, and Adam). Comprehensive experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks, resulting in SOTA average task-wise performance. Analysis on both synthetic and real-world datasets shows that AdaTask balances parameters in every shared layer well.
Translated by Google Translate
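The AU metric and the task-wise accumulator idea described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the RMSProp-style update rule, the scalar parameter, the constant per-task gradients, and the names `update_au` and `simulate` are all assumptions made for illustration.

```python
import math


def update_au(au, update, decay=0.9):
    """Exponentially decaying average of squared updates (AU) for one task."""
    return decay * au + (1.0 - decay) * update ** 2


def simulate(task_grads, taskwise, steps=50, lr=0.01, decay=0.9, eps=1e-8):
    """Track the per-task AU on a single scalar parameter.

    task_grads: constant gradient of each task (hypothetical toy setting).
    taskwise:   if True, keep one squared-gradient accumulator per task;
                if False, keep one accumulator for the summed gradient,
                as a plain RMSProp-style optimizer would.
    """
    n = len(task_grads)
    acc = [0.0] * (n if taskwise else 1)
    au = [0.0] * n
    for _ in range(steps):
        if taskwise:
            # Separate accumulative gradients, hence separate learning rates.
            acc = [decay * a + (1.0 - decay) * g ** 2
                   for a, g in zip(acc, task_grads)]
            scales = [lr / (math.sqrt(a) + eps) for a in acc]
        else:
            # One shared accumulator driven by the sum of all task gradients.
            total = sum(task_grads)
            acc = [decay * acc[0] + (1.0 - decay) * total ** 2]
            scales = [lr / (math.sqrt(acc[0]) + eps)] * n
        # Each task's contribution to the parameter update at this step.
        updates = [s * g for s, g in zip(scales, task_grads)]
        au = [update_au(a, u, decay) for a, u in zip(au, updates)]
    return au


# Toy example: task 0 has gradients 100x larger than task 1.
grads = [100.0, 1.0]
shared_au = simulate(grads, taskwise=False)
taskwise_au = simulate(grads, taskwise=True)
shared_ratio = shared_au[0] / shared_au[1]
taskwise_ratio = taskwise_au[0] / taskwise_au[1]
```

In this toy setting, `shared_ratio` is very large because the task with larger gradients dominates the parameter, while `taskwise_ratio` stays close to 1 because each task's update is normalized by its own accumulator.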
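Geometry-based point cloud compression (G-PCC) can achieve remarkable compression efficiency for point clouds. However, it still leads to serious attribute compression artifacts, especially under low-bitrate scenarios. In this paper, we propose a Multi-Scale Graph Attention Network (MS-GAT) to remove the artifacts of point cloud attributes compressed by G-PCC. We first construct a graph based on the point cloud geometry coordinates and then use Chebyshev graph convolutions to extract features of the point cloud attributes. Considering that a point may be correlated with points both near to and far from it, we propose a multi-scale scheme to capture the short- and long-range correlations between the current point and its neighboring and distant points. To address the problem that different points may have different degrees of artifacts caused by adaptive quantization, we introduce the quantization step as an extra input to the proposed network. We also incorporate a graph attention layer into the network to pay special attention to the points with more attribute artifacts. To the best of our knowledge, this is the first attribute artifact removal method for G-PCC. We validate the effectiveness of our method on various point clouds. Experimental results show that our proposed method achieves an average BD-rate reduction of 9.28%. In addition, our approach achieves some performance improvements on the downstream point cloud semantic segmentation task.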
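We address end-to-end learned video compression with a special focus on better learning and utilizing temporal contexts. For temporal context mining, we propose to store not only the previously reconstructed frames but also the propagated features in the generalized decoded picture buffer. From the stored propagated features, we propose to learn multi-scale temporal contexts and re-fill the learned temporal contexts into the modules of our compression scheme, including the contextual encoder-decoder, the frame generator, and the temporal context encoder. Our scheme discards the parallelization-unfriendly auto-regressive entropy model to pursue a more practical decoding time. We compare our scheme with x264 and x265 (industrial software for H.264 and H.265, respectively) as well as the official reference software for H.264, H.265, and H.266 (JM, HM, and VTM, respectively). When the intra period is 32 and the scheme is oriented to PSNR, it outperforms H.265-HM by 14.4% bit rate saving; when oriented to MS-SSIM, it outperforms H.266-VTM by 21.1% bit rate saving.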
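Cognitive diagnosis, whose goal is to obtain students' proficiency on specific knowledge concepts, is a fundamental task in intelligent education systems. Previous works usually represent each student as a trainable knowledge proficiency vector, which cannot capture the relations among concepts or the basic profile (e.g., memory or comprehension) of students. In this paper, we propose a student representation method that explores the hierarchical relations of knowledge concepts together with a student embedding. Specifically, since the proficiency on parent knowledge concepts reflects the correlation between knowledge concepts, we obtain the first knowledge proficiency with a parent-child concept projection layer. In addition, a low-dimensional dense vector is adopted as the embedding of each student, and the second knowledge proficiency is obtained with a fully connected layer. We then combine the two proficiency vectors above to obtain the final representation of students. Experiments show the effectiveness of the proposed representation method.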
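Educational artificial intelligence aims to benefit tasks in the education domain, such as intelligent test paper generation and consolidation exercises, where the main underlying technique is how to match exercises, known as the finding similar exercises (FSE) problem. Most existing approaches emphasize their models' ability to represent the exercise; unfortunately, many challenges remain, such as data scarcity, insufficient understanding of exercises, and high label noise. We release a Chinese education pre-trained language model, BERT$_{EDU}$, for the label-scarce dataset and introduce a normalization strategy to overcome the diversity of mathematical formulas and terms in exercises. We discover new auxiliary tasks in an innovative way based on problem-solving ideas and propose a very effective MoE-enhanced multi-task model for the FSE task to attain a better understanding of exercises. In addition, confidence learning is utilized to prune the training set and overcome high noise in the labeled data. Experiments show that the methods proposed in this paper are very effective.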
Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.
The development of social media user stance detection and bot detection methods relies heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain and user tweet features as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when introducing multiple relations. By analyzing experiment results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
As one of the prevalent methods for building automation systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the conventional evaluation outcome, the environment returns, as the coefficient to build an importance map. We also conducted experiments to investigate three major questions concerning frames' importance equality, the effectiveness of the importance map, and connections between importance maps from different IL models. The results show that R2RISE successfully distinguishes important frames from the demonstrations.
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.
We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive the tractable reformulation for our model. In particular, we show that the return-risk model can also account for risk from an uncertain transition kernel when one only seeks deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.
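As a rough illustration of the "weighted average of mean and percentile performances" mentioned above, the sketch below evaluates such an objective on a finite sample of returns. It is only an empirical, nominal version: the Wasserstein ambiguity set and the reformulation from the paper are not modeled, and the names `empirical_percentile`, `return_risk_objective`, `weight`, and `alpha` are hypothetical choices for this example.

```python
import math


def empirical_percentile(returns, alpha):
    """Lower empirical alpha-quantile of a non-empty sample, 0 < alpha <= 1."""
    ordered = sorted(returns)
    # Smallest order statistic whose empirical CDF value is at least alpha.
    index = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[index]


def return_risk_objective(returns, weight, alpha):
    """Weighted average of the sample mean and the alpha-percentile.

    weight = 1 recovers the pure mean criterion; weight = 0 recovers the
    pure percentile criterion (both are assumptions of this sketch).
    """
    mean = sum(returns) / len(returns)
    return weight * mean + (1.0 - weight) * empirical_percentile(returns, alpha)


# Toy sample of policy returns (hypothetical numbers).
sample = [4.0, 1.0, 3.0, 2.0]
score = return_risk_objective(sample, weight=0.5, alpha=0.5)
```

A policy search would compare candidate policies by this score, with `weight` controlling how strongly the percentile (risk) term is emphasized relative to the mean.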